MT/IE: Cross-lingual Open Information Extraction with Neural Sequence-to-Sequence Models

Authors

  • Kevin Duh
  • Benjamin Van Durme
  • Sheng Zhang
Abstract

Cross-lingual information extraction is the task of distilling facts from foreign-language text (e.g. Chinese) into representations in another language preferred by the user (e.g. English tuples). Conventional pipeline solutions decompose the task into machine translation followed by information extraction (or vice versa). We propose a joint solution with a neural sequence model, and show that it outperforms the pipeline in a cross-lingual open information extraction setting by 1-4 BLEU and 0.5-0.8 F1.

Similar resources

Selective Decoding for Cross-lingual Open Information Extraction

Cross-lingual open information extraction is the task of distilling facts from the source language into representations in the target language. We propose a novel encoder-decoder model for this problem. It employs a novel selective decoding mechanism, which explicitly models the sequence labeling process as well as the sequence generation process on the decoder side. Compared to a standard enco...

Learning for Sequence Extraction Tasks

We consider the application of machine learning techniques for sequence modeling to Information Retrieval (IR) and surface Information Extraction (IE) tasks. We introduce a generic sequence model and show how it can be used for dealing with different closed-query tasks. Taking into account the sequential nature of texts allows for a finer analysis than what is usually done in IR with static tex...

A Cross-lingual Annotation Projection-based Self-supervision Approach for Open Information Extraction

Open information extraction (IE) is a weakly supervised IE paradigm that aims to extract relation-independent information from large-scale natural language documents without significant annotation efforts. A key challenge for Open IE is to achieve self-supervision, in which the training examples are automatically obtained. Although the feasibility of Open IE systems has been demonstrated for En...

Using Information Extraction to Improve Cross-lingual Document Retrieval

We present a filtering mechanism using two cross-lingual information extraction (CLIE) systems for improving document relevance of cross-lingual information retrieval (CLIR) for queries conforming to predefined templates. Experiments on retrieving Chinese documents in response to English GALE arrest queries show that this approach can obtain a 12.7% absolute improvement in relevance (representi...

Improving Phoneme Sequence Recognition using Phoneme Duration Information in DNN-HSMM

Improving phoneme recognition has attracted the attention of many researchers because of its applications in various fields of speech processing. Recent research shows that using deep neural networks (DNNs) in speech recognition systems significantly improves their performance. DNN-based phoneme recognition systems have two phases: training and testing. Mos...


Publication date: 2017